Two‐phase clustering algorithm with density exploring distance measure
Similar resources
K Modes Clustering Algorithm Based on a New Distance Measure
The leading partitional clustering technique, K Modes, is one of the most computationally efficient clustering methods for categorical data. In the traditional K Modes algorithm, the simple matching dissimilarity measure is used to compute the distance between two values of the same categorical attributes. This compares two categorical values directly and results in either a dif...
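As a rough illustration of the simple matching dissimilarity mentioned in this abstract, the sketch below counts the attributes on which two categorical records disagree. It shows only the standard 0/1 matching form, not the new measure the paper proposes; the function name and the sample records are hypothetical and do not come from the paper.

```python
from typing import Sequence


def simple_matching_dissimilarity(a: Sequence[str], b: Sequence[str]) -> int:
    """Count the attributes on which two categorical records disagree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    # Each attribute contributes 0 when the values match and 1 otherwise.
    return sum(0 if x == y else 1 for x, y in zip(a, b))


# Hypothetical records with three categorical attributes each.
record_a = ("red", "small", "round")
record_b = ("red", "large", "round")
distance = simple_matching_dissimilarity(record_a, record_b)
```

Because every mismatch contributes the same amount, this form cannot express that some category values are closer to each other than others.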
Clustering with a Domain-Specific Distance Measure
With a point matching distance measure which is invariant under translation, rotation and permutation, we learn 2-D point-set objects by clustering noisy point-set images. Unlike traditional clustering methods, which use distance measures that operate on feature vectors (a representation common to most problem domains), this object-based clustering technique employs a distance measure specific to ...
Ontology-based Distance Measure for Text Clustering
Recent work has shown that ontologies are useful for improving the performance of text clustering. In this paper, we present a new clustering scheme based on an ontology-based distance measure. Before the clustering process is run, a term mutual information matrix is calculated with the aid of WordNet and some methods of learning ontologies from textual data. Combining this mutual informatio...
Grid Density Clustering Algorithm
Data mining is the process of finding useful information in huge data repositories. Clustering is a significant data mining task. It is an unsupervised learning task in which similar data items are grouped together to form clusters. Clustering now plays a major role in many day-to-day applications. In this paper, the field of KDD, i.e. Knowledge Discovery in Databases, Data mining...
Author clustering with the Aid of a Simple Distance Measure
A simple distance measure has been applied to the author clustering problem to determine which documents are written by the same author. This distance measure works with the probability distribution of character sequences of a document, making it insensitive to language differences. The k most frequent features, where k is chosen to be 300, determine the distribution where punctuatio...
Journal
Journal title: CAAI Transactions on Intelligence Technology
Year: 2018
ISSN: 2468-2322
DOI: 10.1049/trit.2018.0006